-
Abstract: As the web of biodiversity knowledge continues to grow and become more complex, practical questions arise: How do we publish and review works that use big and complex datasets? How do we keep track of data use across biodiversity data networks? How do we keep our digital data available for the next 50 years? In this iDigBio lunch seminar, Jorrit Poelen works towards answering these questions through use cases taken from Global Biotic Interactions (GloBI, https://globalbioticinteractions.org), Terrestrial Parasite Tracker TCN (TPT, https://parasitetracker.org) and Preston (https://preston.guoda.bio), a biodiversity data tracker.
-
Global Biotic Interactions (GloBI, https://globalbioticinteractions.org) uses frugal and pragmatic methods to make openly available species interaction datasets (e.g., parasite-host, predator-prey, plant-pollinator) easier to find and reuse. Since 2013, GloBI has increased the reach of existing datasets, facilitated research, improved data integration methods, and provided dataset reviews. In this talk, GloBI is introduced and various reuse examples are presented to discuss the question: Why should we bother to reuse existing (species-interaction) datasets?
-
As R is becoming a standard research tool, a basic question remains: how can data used in R programs be referenced reliably? Almost any R user can sympathize with the problems of local paths (e.g., read.csv("path/to/file.csv")), and URLs aren't much better (or are worse when bandwidth is limited). R developers have largely sidestepped this problem by packaging the data with the code, which has made datasets like "iris" and "mtcars" famous. However, this approach is of little help for real-world data. With the help of code examples and the R package "contentid", we show how to write R code that is agnostic to the location of data, works with local and remote data, and ensures that the requested data is used.
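The idea of referencing data by content rather than by location can be illustrated with a minimal sketch. It is written in Python for brevity and is not the contentid API; the function names `content_id` and `resolve`, and the example file names, are hypothetical. It assumes SHA-256 identifiers written in the `hash://sha256/...` form and only checks local files, whereas a real resolver would also consult remote registries.

```python
import hashlib
import tempfile
from pathlib import Path


def content_id(data: bytes) -> str:
    """Return a SHA-256 content identifier in hash-URI form."""
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def resolve(identifier: str, candidates: list[Path]) -> Path:
    """Return the first candidate file whose bytes match the identifier.

    Locations are only hints: a file is accepted because its content
    hashes to the requested identifier, not because of where it lives.
    """
    for path in candidates:
        if path.is_file() and content_id(path.read_bytes()) == identifier:
            return path
    raise FileNotFoundError(f"no candidate matches {identifier}")


# Hypothetical demo: two copies of a dataset, one of them silently changed.
workdir = Path(tempfile.mkdtemp())
original = b"species,count\nApis mellifera,3\n"
(workdir / "stale.csv").write_bytes(b"species,count\nApis mellifera,4\n")
(workdir / "mirror.csv").write_bytes(original)

wanted = content_id(original)
found = resolve(wanted, [workdir / "stale.csv", workdir / "mirror.csv"])
print(found.name)
```

Because the identifier is derived from the bytes themselves, the script keeps working when a file moves, and a changed copy is skipped rather than silently used.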
-
The deluge of digital biodiversity datasets unleashed through institutional, national and global infrastructures brings up an inconvenient truth: internet-connected infrastructures are in a constant state of flux while preservation and integration of digital knowledge are often afterthoughts. Rather than taking digital amnesia for granted, we examine examples of durable and frugal digital data preservation and integration methods. Examples include tracking external datasets, creating verifiable data citations, cross-publishing and cross-linking datasets, reproducing data-integration processes, and distributing large data archives across poor, or nonexistent, internet connections. Topics include cryptographic hashes, Provenance Ontology, content-addressed storage, Unix philosophy, and offline-first design as applied in projects like Preston (https://preston.guoda.bio) and Global Biotic Interactions (https://globalbioticinteractions.org). The examples are then related to best practices applied by proven knowledge-preservation experts: librarians and curators.
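Content-addressed storage, one of the topics listed above, can be sketched in a few lines. This is an illustrative Python example under stated assumptions, not Preston's implementation: the class name `ContentStore`, its `put` and `get` methods, and the two-level directory layout are all hypothetical choices made for the sketch.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentStore:
    """A tiny content-addressed store: each file is named by its SHA-256 hash."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, digest: str) -> Path:
        # Hypothetical layout: first two hex characters as a subdirectory.
        return self.root / digest[:2] / digest

    def put(self, data: bytes) -> str:
        """Store the bytes and return their hex digest; duplicates are stored once."""
        digest = hashlib.sha256(data).hexdigest()
        target = self._path(digest)
        target.parent.mkdir(exist_ok=True)
        if not target.exists():
            target.write_bytes(data)
        return digest

    def get(self, digest: str) -> bytes:
        """Read the bytes back and verify them against the requested digest."""
        data = self._path(digest).read_bytes()
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its hash")
        return data


# Hypothetical demo: storing the same dataset twice yields one key and one file.
store = ContentStore(Path(tempfile.mkdtemp()) / "store")
key_a = store.put(b"source,target\nbee,flower\n")
key_b = store.put(b"source,target\nbee,flower\n")
print(key_a == key_b)
```

Since the key is a cryptographic hash of the content, such a store needs no network connection to verify what it holds, and copies can be exchanged on any medium and checked on arrival.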